IS 6200 (Part 4) : 2008

Indian Standard
STATISTICAL TESTS OF SIGNIFICANCE
PART 4 NON-PARAMETRIC TESTS
(First Revision)

ICS 03.120.30

© BIS 2008
BUREAU OF INDIAN STANDARDS
MANAK BHAVAN, 9 BAHADUR SHAH ZAFAR MARG
NEW DELHI 110002

July 2008    Price Group 9

Statistical Methods for Quality and Reliability Sectional Committee, MSD 3

FOREWORD

This Indian Standard (Part 4) (First Revision) was adopted by the Bureau of Indian Standards, after the draft finalized by the Statistical Methods for Quality and Reliability Sectional Committee had been approved by the Management and Systems Division Council.

This standard was first issued in 1983. It has been reviewed in the light of the statistical changes required in the present context, the modification of some of the concepts, and the alignment of its presentation with other Indian Standards on the subject.

Statistical tests of significance are important tools in industrial experimentation and decision making. These tests may broadly be classified into two categories, namely, parametric tests and non-parametric tests. A parametric test is a test whose model specifies certain assumptions about the parameters of the population from which the sample is drawn. The statistical tests described in Part 1 of this standard are parametric tests, as these tests are concerned with hypotheses about the parameters of the population. For example, in the case of the t-test and the F-test, it is assumed that the variances of the two populations are the same. Further, it is also assumed that the samples are drawn from a normal population. So the meaningfulness of the results of a parametric test depends upon the validity of these assumptions. It is, therefore, necessary to verify these assumptions before applying a parametric test. But these assumptions are not ordinarily tested and are assumed to hold good. These assumptions, therefore, restrict the wider applicability of these tests.
A number of statistical tests of significance are available which do not make the assumption of normality of the parent population. These tests are known as non-parametric or distribution-free tests and are based on ranking or ordering of observations or on the number of observations exceeding or falling short of a given value. This standard describes some of the important non-parametric tests for testing whether two samples are drawn from the same population. The two samples may be independent or related to each other. Some of the tests described in this standard are applicable when samples are drawn from related populations, while the others are applicable to samples from independent populations. The Kolmogorov-Smirnov one-sample test is also included for testing whether a sample has been drawn from a given population.

IS 6273 (Part 3) : 1983 'Guide for sensory evaluation of foods: Part 3 Statistical analysis of data' also gives applications of some of the non-parametric tests in the analysis of data arising from sensory evaluation experiments. For further details, reference may be made to that standard.

The statistical tests described in Part 1 of this standard are the normal, t and F tests. The χ²-test and tests for normality are covered in Parts 2 and 3 respectively.

In reporting the results of a test or analysis made in accordance with this standard, if the final value, observed or calculated, is to be rounded off, it shall be done in accordance with IS 2 : 1960 'Rules for rounding off numerical values (revised)'.

1 SCOPE

1.1 This standard (Part 4) lays down the following tests of significance:

a) Kolmogorov-Smirnov one-sample test,
b) Kolmogorov-Smirnov two-sample test,
c) Sign test,
d) Wilcoxon matched-pairs sign-rank test,
e) Median test,
f) Mann-Whitney U test, and
g) Wald-Wolfowitz run test.

1.2 For these tests, procedures to deal with small samples and large samples have been explained separately in this standard. The power efficiency of each test has also been indicated.

2 REFERENCES

The following standards contain provisions, which through reference in this text, constitute provisions of this standard. At the time of publication, the editions indicated were valid. All standards are subject to revision and parties to agreements based on this standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below:

IS No.                    Title
6200 (Part 2) : 2004      Statistical tests of significance: Part 2 χ²-test (second revision)
7920                      Statistical vocabulary and symbols:
  (Part 1) : 1994           Probability and general statistical terms (second revision)
  (Part 2) : 1994           Statistical quality control (second revision)

3 TERMINOLOGY

For the purpose of this standard, the definitions given in IS 7920 (Part 1) and IS 7920 (Part 2) shall apply.

4 ADVANTAGES AND DISADVANTAGES OF NON-PARAMETRIC TESTS

4.1 The non-parametric tests have the following advantages:

a) These tests can be used even if the assumption of normality of the parent population is unrealistic. (The claim that 'probability statements from non-parametric tests are exact probabilities regardless of the shape of the population distribution' is not true; for example, the power of the sign test depends on the form of the population of differences.)
b) For sample size less than or equal to 6, there is no alternative but to use a non-parametric statistical test unless the nature of the population distribution is known exactly.
c) There are suitable non-parametric statistical tests for treating samples made up of observations from several different populations. None of the parametric tests can handle such data without requiring unrealistic assumptions.
d) Non-parametric statistical tests are available to treat data which are inherently in ranks as well as data whose seemingly numerical scores have the strength of ranks, for example, where one is able to select one of two characteristics in preference to the other. This type of data cannot be treated by parametric methods unless some realistic assumption is made about the underlying distribution.
e) Non-parametric tests are much easier to learn, calculate and apply as compared to parametric tests.
f) Non-parametric methods are available to treat data which are simply classificatory. No parametric technique can be applied to such data.

4.2 The non-parametric tests have some disadvantages also. If all the assumptions of the relevant parametric statistical model are met, then non-parametric tests are less sensitive than the corresponding parametric tests. The degree of sensitivity is expressed by the power-efficiency of the non-parametric test.

5 BASIC CONCEPTS

5.1 Statistical tests of significance are important tools in decision-making. They are extremely useful in finding out whether, in the case of one population, the mean value differs significantly from a certain specified value or whether, in the case of two populations, the mean values differ significantly from each other. Thus, it may be desirable to find out whether a new germicide is more effective in treating a certain type of infection than a standard germicide, whether a new method of sealing light bulbs will increase their life, or whether one method of preserving foods is better than another insofar as the retention of vitamins is concerned. In such cases, it would be necessary to examine whether the mean values can be deemed as same or different.
There may also be cases where it may be worthwhile to find out whether one inspector is more consistent than another, whether a new source of raw material has resulted in a change in the variability of the output, or whether the temperature of the bath in which the cocoons are cooked affects the uniformity of the quality of silk. In these cases it will be necessary to determine whether the variances are the same or not.

5.2 Formulation of Hypotheses

For taking a decision using statistical tests of significance, the first step is to form the hypotheses, namely, the Null Hypothesis (H0) and the Alternative Hypothesis (H1).

5.2.1 Null Hypothesis (H0)

The procedure commonly used is to first set up a null hypothesis regarding equivalence (no difference). The question on which the decision is called for, by applying the tests of significance, is translated in terms of a null hypothesis in such a way that this null hypothesis would be likely to be rejected if there is enough evidence against it as seen from the data in the sample. For example, a null hypothesis may be that the data follow a normal distribution.

5.2.2 Alternative Hypothesis (H1)

The alternative hypothesis is the hypothesis that will be preferred in case the null hypothesis is not true.

5.3 Level of Significance

5.3.1 There are two kinds of errors involved in taking the decision based on the tests of significance, namely:

a) Type I Error: Error in deciding that a significant difference exists when there is no real difference.
b) Type II Error: Error in deciding that no difference exists when there is a real difference.

5.3.2 Type I and Type II errors are also called error of the first kind and error of the second kind respectively. This process of decision making is described in the table given below:

               H0 True             H1 True
Reject H0      Type I error        Correct decision
Accept H0      Correct decision    Type II error

5.3.3 Based on the distribution of the test statistic used, it is possible to work out the probability of committing a Type I error. The probability of committing a Type I error is called the level of significance (α). The probability of committing a Type II error is denoted by β. It is not possible to minimize both these probabilities (risks) at the same time. Hence one of the risks, usually that of the first kind, is controlled by assigning to it a chosen level of probability. Generally the value for level of significance is chosen as 0.05 or 0.01, that is, 5 percent or 1 percent. This implies a confidence level of 95 percent or 99 percent respectively.

5.4 The decision-making procedure involves the comparison of the calculated value of the statistic with the tabulated value. If the calculated value is greater than or equal to the tabulated value of the statistic, then H0 is rejected, thereby accepting H1; otherwise H0 is not rejected. For practical purposes, 'H0 not rejected' is taken as if H0 is accepted.

5.5 p-Value Approach for Statistical Tests of Significance

The p-value approach for statistical tests of significance can also be used for this purpose. For performing any test of significance, the probability p that the test-statistic assumes, under the null hypothesis, the observed value or a value more favourable to the alternative hypothesis is calculated. This p-value is given alongside the observed value of the test-statistic in statistical packages used for performing such tests. If the calculated p-value is less than the chosen level of significance α, the null hypothesis is rejected; otherwise it is accepted. The advantage of this approach is that there is no need to look for critical values of the test-statistic in statistical tables.

6 KOLMOGOROV-SMIRNOV ONE-SAMPLE TEST

6.1 The Kolmogorov-Smirnov one-sample test is a test of goodness of fit, that is, it is concerned with the degree of agreement between the distribution of a set of sample values and some specified theoretical distribution. It determines whether the sample values can reasonably be thought to have come from a population having a theoretical distribution with known parameters.

6.2 The test is accomplished by finding the theoretical cumulative frequency distribution which would be expected under the null hypothesis, F(X), and comparing it with the observed cumulative frequency distribution, Sn(X). Under the null hypothesis that the sample has been drawn from the specified theoretical distribution, it is expected that for every value of X, Sn(X) should be fairly close to F(X), that is, the differences between the theoretical and observed distributions should be small and within the limits of random errors. The point at which these two distributions, theoretical and observed, show the maximum deviation is determined. Let

$D = \max |F(X) - S_n(X)|$

6.3 This value of D is calculated and compared with the critical value given in Annex A for the desired level of significance. The null hypothesis is rejected if the calculated value of D is greater than the critical value; otherwise not. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

6.4 If the sample size (n) is more than 35, the critical value of D is $1.36/\sqrt{n}$ for 5 percent level of significance and $1.63/\sqrt{n}$ for 1 percent level of significance.

6.5 Example 1

The mean and standard deviation of 150 observations of leakage current taken on 150 electric irons are 37.6 µA and 12.30 µA respectively. These values are given in the form of a frequency distribution in Table 1. It has to be tested whether the data follow a normal distribution.

Table 1 Frequency Distribution of Leakage Current

Class-Interval    Frequency
(1)               (2)
10.5-15.5           6
15.5-20.5           8
20.5-25.5          12
25.5-30.5          16
30.5-35.5          21
35.5-40.5          30
40.5-45.5          18
45.5-50.5          15
50.5-55.5          10
55.5-60.5           9
60.5-65.5           5
Total             150

6.5.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that the data follow the normal distribution, against an alternative hypothesis (H1) that the data do not follow the normal distribution.

6.5.2 The area to the left of the upper limit of each class-interval, as shown in Table 2, is calculated with the help of standard normal probability tables.

Table 2 Area Under Normal Curve to Left of Upper Limits of Class Intervals
(Clause 6.5.2)

Upper Limit of       Z =               F(X) = Area to   Cumulative Frequency   Sn(X) =      |F(X) - Sn(X)|
Class Interval (X)   (X - 37.6)/12.3   the Left of Z    (Less Than X)          Col 4/150    = |Col 3 - Col 5|
(1)                  (2)               (3)              (4)                    (5)          (6)
15.5                 -1.80             0.0359             6                    0.0400       0.0041
20.5                 -1.39             0.0823            14                    0.0933       0.0110
25.5                 -0.98             0.1635            26                    0.1733       0.0098
30.5                 -0.58             0.2810            42                    0.2800       0.0010
35.5                 -0.17             0.4325            63                    0.4200       0.0125
40.5                  0.24             0.5948            93                    0.6200       0.0252
45.5                  0.64             0.7389           111                    0.7400       0.0011
50.5                  1.05             0.8531           126                    0.8400       0.0131
55.5                  1.46             0.9279           136                    0.9067       0.0212
60.5                  1.86             0.9686           145                    0.9667       0.0019
65.5                  2.27             0.9884           150                    1.0000       0.0116

Therefore, D = Maximum |F(X) - Sn(X)| = 0.0252.

6.5.3 The critical value of D at 5 percent level of significance is $1.36/\sqrt{150} = 0.1110$. Since the calculated value of D is less than the critical value, the null hypothesis that the data follow normal distribution is not rejected. The 2-tailed p-value obtained by using the SPSS package comes out as 0.0642 > 0.05, so this conclusion is further confirmed.

6.6 Power Efficiency

The Kolmogorov-Smirnov one-sample test treats individual observations separately and thus, unlike the χ²-test for one sample, need not lose information through the combining of categories. When expected frequencies of some classes are less than 5, adjacent categories shall be combined before χ² may properly be computed. So the χ²-test is less powerful than the Kolmogorov-Smirnov test. Moreover, for very small samples the χ²-test is not applicable at all, but the Kolmogorov-Smirnov test is. These facts suggest that the Kolmogorov-Smirnov test may, in all cases, be more powerful than its alternative, the χ²-test.

7 KOLMOGOROV-SMIRNOV TWO-SAMPLE TEST

7.1 The Kolmogorov-Smirnov two-sample test is used to test whether two independent samples have been drawn from the same population, that is, from populations having the same distribution. The two-sided test is sensitive to any kind of difference in the distributions from which the two samples are drawn, that is, differences in location, in dispersion, in skewness, etc. The one-sided test is used to test whether or not the values of the population from which one of the samples is drawn are stochastically larger than the values of the population from which the other sample is drawn.

7.2 Like the Kolmogorov-Smirnov one-sample test, this two-sample test is concerned with the agreement between two cumulative distributions. The one-sample test is concerned with the agreement between the distribution of a set of sample values and some specified theoretical distribution. The two-sample test is concerned with the agreement between two sets of sample values.

7.3 If the two samples have in fact been drawn from the same population distribution, then the cumulative distributions of both samples may be expected to be fairly close to each other inasmuch as they both should show only random deviations from the population distribution. If the two sample cumulative distributions are too far apart at any point, this will suggest that the samples have come from different populations. Thus a large enough deviation between the two sample cumulative distributions is evidence for rejecting the null hypothesis.

7.4 To apply the Kolmogorov-Smirnov two-sample test, a cumulative frequency distribution is obtained for each sample of observations, using the same intervals for both distributions.
From the cumulative frequencies for both the samples, the cumulative step function values are calculated. Corresponding to each interval, the cumulative step function of one sample is subtracted from the other. The test focuses on the largest of these observed deviations.

7.5 Let $S_{n_1}(X)$ be the observed cumulative step function of one of the samples, that is, $S_{n_1}(X) = K/n_1$, where K is the number of observations equal to or less than X. Let $S_{n_2}(X)$ be the observed cumulative step function of the other sample, that is, $S_{n_2}(X) = K/n_2$. Then:

$D = \max [S_{n_1}(X) - S_{n_2}(X)]$ is taken as the test criterion for a one-sided test, and

$D = \max |S_{n_1}(X) - S_{n_2}(X)|$ for a two-sided test.

7.6 This value of D is calculated and compared with the critical value for the desired level of significance. The null hypothesis is rejected if the calculated value is greater than the critical value, otherwise not. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

7.7 Small Samples

When n1 = n2 and both n1 and n2 are 40 or less, Annex B may be used for testing the null hypothesis. This Annex gives various values of K_D, which is defined as the numerator of the largest difference between the two cumulative step functions, that is, the numerator of D. To read Annex B, one must know the value of n (which in this case is the value of n1 = n2) and the desired level of significance. Observe also whether the alternative hypothesis (H1) calls for a one-sided or a two-sided test. With this information, one may determine the significance of the observed data.

7.8 Example 2

Two machines were used for the production of cylinders in a workshop. A random sample of 10 cylinders from each of the two machines was selected to determine the diameter. The values of the diameter are given in Table 3.
Table 3 Diameters of Cylinders, mm

Machine I     Machine II
121           123
118           126
126           121
124           124
127           125
119           121
121           125
125           128
121           123
120           124

It has to be tested whether the two machines are producing cylinders of the same diameter.

7.8.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that the two machines are producing cylinders of the same diameter, against the alternative hypothesis (H1) that the two machines are not producing cylinders of the same diameter.

7.8.2 The cumulative frequency distributions of both the samples are given in Table 4.

7.8.3 The critical value of K_D from Annex B at 5 percent level of significance is 7. Since the calculated value of K_D is less than the critical value, the null hypothesis that the two machines are producing cylinders of the same diameter is not rejected.

7.9 Large Samples

7.9.1 Two-Sided Test

When both n1 and n2 are larger than 40, the critical values for the Kolmogorov-Smirnov two-sample test at the desired level of significance are calculated by the following relations:

$1.36 \sqrt{\dfrac{n_1 + n_2}{n_1 n_2}}$ for 5 percent level of significance, and

$1.63 \sqrt{\dfrac{n_1 + n_2}{n_1 n_2}}$ for 1 percent level of significance.

The value of D as calculated in 7.5 is compared with this critical value. The null hypothesis is rejected if the calculated value of D is greater than the critical value, otherwise not. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

7.9.2 One-Sided Test

When n1 and n2 are large, and regardless of whether or not n1 = n2, the value of D for the one-sided test is calculated as given in 7.5, that is, by using the following relation:

$D = \max [S_{n_1}(X) - S_{n_2}(X)]$

It has been shown that

$\chi^2 = 4 D^2 \dfrac{n_1 n_2}{n_1 + n_2}$

is approximately distributed as chi-square (χ²) with two degrees of freedom.
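The two-sample statistic of 7.5 and the large-sample relations of 7.9 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the standard: the function names are hypothetical, and the input is assumed to be the class frequencies of the two samples on common intervals, here taken from Table 4 (Clause 7.8.2).

```python
from itertools import accumulate
from math import sqrt


def ks_two_sample_d(freq1, freq2):
    """Return (two-sided D, one-sided D) from class frequencies on common intervals."""
    n1, n2 = sum(freq1), sum(freq2)
    # Cumulative step functions S_n1(X) = K/n1 and S_n2(X) = K/n2 (see 7.5).
    s1 = [k / n1 for k in accumulate(freq1)]
    s2 = [k / n2 for k in accumulate(freq2)]
    diffs = [a - b for a, b in zip(s1, s2)]
    return max(abs(d) for d in diffs), max(diffs)


def large_sample_critical_d(n1, n2, coefficient=1.36):
    """Two-sided critical value of 7.9.1 (1.36 for 5 percent, 1.63 for 1 percent)."""
    return coefficient * sqrt((n1 + n2) / (n1 * n2))


def one_sided_chi_square(d, n1, n2):
    """Chi-square statistic of 7.9.2, referred to 2 degrees of freedom."""
    return 4 * d * d * n1 * n2 / (n1 + n2)


# Class frequencies of the two samples as tabulated in Table 4 (Example 2).
freq_machine_1 = [1, 1, 1, 2, 0, 1, 1, 1, 1, 1, 0]
freq_machine_2 = [0, 0, 0, 2, 1, 2, 2, 1, 1, 0, 1]

d_two_sided, d_one_sided = ks_two_sample_d(freq_machine_1, freq_machine_2)

# For equal sample sizes, K_D is the largest difference of cumulative counts (see 7.7).
k_d = max(abs(a - b) for a, b in zip(accumulate(freq_machine_1),
                                     accumulate(freq_machine_2)))

print(f"D = {d_two_sided:.1f}, K_D = {k_d}")  # D = 0.3, K_D = 3
```

With n1 = n2 = 10 the small-sample table of Annex B applies, so the two large-sample helpers are shown only to make the relations of 7.9 concrete; they are not needed for Example 2.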
Thus, calculating the value of χ² from the relation given above and comparing it with the tabulated value of χ² for two degrees of freedom at the desired level of significance, one may determine whether the null hypothesis is rejected or not. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

Table 4 Cumulative Frequency Distributions
(Clause 7.8.2)

Class-Interval    First Sample             Second Sample            S1(x) =     S2(x) =     D
                  Frequency  Cumulative    Frequency  Cumulative    Col 3/10    Col 5/10
                             Frequency                Frequency
(1)               (2)        (3)           (4)        (5)           (6)         (7)         (8)
117.5-118.5       1           1            0           0            0.1         0.0         0.1
118.5-119.5       1           2            0           0            0.2         0.0         0.2
119.5-120.5       1           3            0           0            0.3         0.0         0.3
120.5-121.5       2           5            2           2            0.5         0.2         0.3
121.5-122.5       0           5            1           3            0.5         0.3         0.2
122.5-123.5       1           6            2           5            0.6         0.5         0.1
123.5-124.5       1           7            2           7            0.7         0.7         0.0
124.5-125.5       1           8            1           8            0.8         0.8         0.0
125.5-126.5       1           9            1           9            0.9         0.9         0.0
126.5-127.5       1          10            0           9            1.0         0.9         0.1
127.5-128.5       0          10            1          10            1.0         1.0         0.0

Therefore, D = maximum |S1(x) - S2(x)| = 0.3 = 3/10.
K_D = numerator of D = 3.

NOTE: D is expressed as the fraction K_D/n, where n is the sample size.

8 SIGN TEST

8.1 For testing equality of means of two correlated populations (X, Y) against one-sided or two-sided alternatives, the parametric paired t-test cannot be used if the normality assumption of the underlying bivariate population is unrealistic. In this case, one can conveniently use the paired-sample sign test. This test can be used even if, instead of actual paired measurements (xi, yi), only data on whether xi is greater than or less than yi are available.

8.2 Null Hypothesis and Alternative Hypothesis

Here the null hypothesis is H0: $\mu_d = 0$ against one of the alternatives H1: $\mu_d > 0$, or H2: $\mu_d < 0$, or H3: $\mu_d \neq 0$, where $\mu_d$ is the median of the population of differences D = X - Y.

8.3 Test-Statistic

For each of n pairs of sample observations (xi, yi), i = 1, 2, ..., n, the difference di = xi - yi is calculated and only the sign (plus or minus) of each such difference is noted, ignoring the magnitudes of differences altogether. Differences exactly equal to zero are ignored, and n is reduced accordingly. Under the null hypothesis, it is expected that the numbers of plus and minus signs should be equal. Therefore, the null hypothesis should be rejected if one kind of sign occurs predominantly in larger number than the other.

8.4 Small Samples

This method shall be employed when the number of pairs (n) in the sample is less than or equal to 25. From the data, the number of fewer signs (say x) is calculated. This calculated value of x is then compared with the critical value of x corresponding to sample size n and the desired level of significance. The critical values for this purpose are given in Annex C. Depending upon the alternative hypothesis, which determines whether the test is two-sided or one-sided, the appropriate critical value may be selected. If the critical value is greater than the calculated value, the null hypothesis is rejected; otherwise not. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

8.5 Example 3

The percentage yields (XA and XB) of a medicinal product in two chemical processes are given in Table 5. It has to be tested whether the data provide evidence of a difference between the two processes.

Table 5 Percentage Yields of Medicinal Product

Batch No.    XA      XB
1            60.1    63.9
2            57.0    60.3
3            58.6    58.5
4            58.8    61.3
5            60.2    59.7
6            58.0    61.0
7            59.2    60.8
8            60.1    60.2

8.5.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that the data do not provide evidence of a difference between the two processes, against an alternative hypothesis (H1) that there is a difference between the two processes.

8.5.2 The sign of the difference (XA - XB) is given below:

Batch No.            1  2  3  4  5  6  7  8
Sign of (XA - XB)    -  -  +  -  +  -  -  -

Number of +ve signs = 2
Number of -ve signs = 6
Therefore, x = number of fewer signs = 2

From Annex C, the critical value of x for a two-sided test for n = 8 and 5 percent level of significance is zero. Since the critical value is less than the calculated value, the null hypothesis is not rejected. Since the 2-tailed p-value comes out as 0.2891 > 0.05, the decision is upheld.

8.6 Large Samples

When the sample size is more than 25, the normal approximation may be used. The distribution of x (the number of either positive or negative signs) is binomial with mean np and variance np(1 - p), where n is the sample size and p is the probability of occurrence of a positive or negative sign. Thus the standardized normal test is given by:

$Z = \dfrac{x - np}{\sqrt{np(1-p)}} = \dfrac{2x - n}{\sqrt{n}}$ under the null hypothesis of p = 1/2.

8.6.1 The value of Z is calculated and compared with the critical value for 5 percent or 1 percent level of significance for a one-sided or a two-sided test, as given below:

Significance Level (α)    One-Sided Test    Two-Sided Test
0.05                      1.645             1.960
0.01                      2.326             2.576

The null hypothesis is rejected if the absolute value of the calculated Z is greater than the critical value, otherwise not rejected. Alternatively, H0 is rejected if the p-value, that is, the probability under H0 of the observed value of the test-statistic or of values more favourable to H1, is less than the chosen level of significance α.

8.7 Example 4

In the export of iron ore, the iron content of the ore is determined at the loading port by using the sampling method prevalent in the exporting country. Again, the same shipment is sampled by the importing country by their own method for the determination of iron content. The results of 40 shipments of iron ore exported from India to an overseas country are given in Table 6.
It is intended to find whether the two methods of sampling adopted by the exporting and importing countries (in the estimation of iron content) are significantly different from each other. It is assumed that there is no change in quality of iron ore during transit.

Table 6 Iron Content of Iron Ore

Shipment    Iron Content at    Iron Content at    Sign of
            Loading Port       Unloading Port     Difference
(1)         (2)                (3)                (4)
 1          64.74              65.11              -
 2          64.53              65.71              -
 3          64.28              65.16              -
 4          64.97              65.44              -
 5          63.62              63.73              -
 6          65.62              65.16              +
 7          64.46              64.46              0
 8          65.03              64.25              +
 9          64.80              66.22              -
10          64.53              65.17              -
11          65.46              65.44              +
12          65.48              66.29              -
13          64.90              65.72              -
14          65.10              63.13              +
15          65.24              64.79              +
16          65.25              65.25              0
17          65.50              65.21              +
18          65.61              64.10              +
19          65.52              65.77              -
20          64.62              64.83              -
21          64.14              65.25              -
22          65.14              65.21              -
23          65.06              65.94              -
24          65.12              65.05              +
25          66.06              64.66              +
26          66.42              65.34              +
27          62.86              64.46              -
28          64.54              64.09              +
29          64.28              65.16              -
30          63.36              64.14              -
31          65.32              64.54              +
32          63.62              63.73              -
33          64.53              65.17              -
34          65.48              65.44              +
35          65.48              66.29              -
36          65.10              63.13              +
37          64.74              65.11              -
38          65.03              64.25              +
39          65.71              65.65              +
40          64.90              65.72              -

8.7.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis (H0) is that there is no difference in the estimation of iron content by the two sampling methods, against an alternative hypothesis (H1) that there is a difference between the two methods.

8.7.2 It can be seen that out of 40 differences, two are zero. Ignoring the zeros, from the remaining 38 differences, the number of positive signs is 16 and the number of negative signs is 22.

x = number of fewer signs (+ve) = 16
n = sample size = 38
p (under H0) = 1/2

$Z = \dfrac{2x - n}{\sqrt{n}} = \dfrac{2 \times 16 - 38}{\sqrt{38}} = -0.97$

8.7.3 Since the absolute value of the calculated Z is less than 1.96, it is concluded that the two methods of sampling are not significantly different at 5 percent level of significance. Going by the p-value approach also, since P(Z < -0.97) = 0.1660 is larger than 0.05, the same conclusion is reached.

8.8 Power Efficiency

The power efficiency of the sign test as compared to the t-test is about 95 percent for sample size (n) = 6, but it declines as the sample size increases to an asymptotic efficiency of 63 percent.

9 WILCOXON MATCHED-PAIRS SIGN-RANK TEST

9.1 The sign test utilizes information simply about the direction of differences within pairs. The Wilcoxon test, in contrast, is based on the relative magnitude as well as the direction of the differences, that is, it gives more weight to a pair which shows a large difference than to a pair which shows a small difference.

9.2 For any matched pair, the difference d between the two observations is calculated. Such d's are ranked without regard to sign, that is, a rank of 1 is given to the smallest d, a rank of 2 to the next smallest and so on. Thus a difference of -1 will have a lower rank than a difference of either +2 or -2. Then the sign of the difference is assigned to each rank, that is, it is indicated which of the ranks arise from negative d's and which from positive d's.

9.3 If the difference between any pair is zero, that pair is dropped from the analysis and the sample size (n) is thereby reduced. It may also be possible that a tie may occur, that is, two or more pairs may have the same numerical value of difference. The rank assigned in such cases is the average of the ranks which would have been assigned if the d's had differed slightly. For example, three pairs may have the value of d as -1, -1 and +1. In this case each pair would be assigned the rank of 2, because the average of the ranks is (1 + 2 + 3)/3 = 2. Then the next d in order would receive the rank of 4, because the ranks 1, 2 and 3 have already been used.

9.4 Under the null hypothesis it is expected that the sum of the ranks having a plus sign and that having a minus sign should be equal.
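The ranking rules of 9.2 and 9.3 and the two rank sums referred to in 9.4 can be sketched as follows. This is a minimal illustration only; the function name is hypothetical, and differences are compared exactly, so measured data should be rounded to their reported precision before being passed in.

```python
def signed_rank_sums(differences):
    """Return (sum of positive ranks, sum of negative ranks, reduced n)."""
    # 9.3: pairs with zero difference are dropped and n is reduced.
    nonzero = [d for d in differences if d != 0]
    # 9.2: rank the differences without regard to sign.
    ordered = sorted(nonzero, key=abs)

    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # 9.3: tied values share the average of ranks i+1 .. j.
        rank_of[abs(ordered[i])] = (i + 1 + j) / 2
        i = j

    positive = sum(rank_of[abs(d)] for d in nonzero if d > 0)
    negative = sum(rank_of[abs(d)] for d in nonzero if d < 0)
    return positive, negative, len(nonzero)


# The tie example of 9.3 (-1, -1, +1 share rank 2), extended with an
# illustrative next difference of +2 (rank 4) and one zero difference.
print(signed_rank_sums([-1, -1, 1, 2, 0]))  # (6.0, 4.0, 4)
```

The two sums always add up to n(n + 1)/2 for the reduced n, which gives a quick check on the ranking.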
Therefore, if the sum of ranks of positive sign is very much different from that of negative sign, it is expected that there is a significant difference and the null hypothesis should be rejected. Table 8 Differences Source (1) 1 Difference Table 7 Yield of Waxy Substance Tobacco Leaves (Clause 9.6) Source (1) Solvent A (2) from Solvent B (3) 1 ~ 3 4 5 6 7 8 9 2.3 3.2 2.5 4,8 4.2 2.8 3.6 4.6 3.9 4.5 3.0 2.7 2.8 4,3 5.2 4.0 3.6 3.2 4,8 5,8 10 9.6.2 The difference given in Table 8. d and the signed ranks of d are 9.5 Small Samples This method shall be employed when the number of pairs (n) is less than or equal to 25. Let Tbe the smaller sum of like signed ranks, that is, T is either the sum of the positive ranks or the sum of the negative ranks, whichever is smaller. The value of T is calculated from a sample of n pairs and compared with the critical value for sample size n and desired level of significance. Depending upon the alternative hypothesis being twosided or one-sided, the appropriate critical value may be chosen from Annex D. If the critical value is greater than the calculated value of T, th~n the null hypothesis is rejected, otherwise not. Alternatively, if the calculated p-value of the test-statistic assuming, under FfO,the observed value and more likely values favouring H,, is less than the chosen level of significance et, Ho is rejected. 9.6 Example 5 The yield of waxy substance from tobacco leaves is supposed to depend upon the solvent used for extraction. For each of the ten different sources from which leaves were obtained, two extractions, using solvent A and solvent B were carried out. The results are given in Table 7. It has to be tested whether the two solvents produce significantly different estimate of wax contents. 
9.6.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that there is no difference between the estimates of wax content by the two solvents, against the alternative hypothesis (H1) that there is a significant difference between the estimates.

Table 8 Differences and Signed Ranks
(Clause 9.6.2)

Source    Difference    Rank of d    Rank with Less
          d = A − B                  Frequent Sign
(1)       (2)           (3)          (4)
1         −0.7          −4
2          0.5           2.5         2.5
3         −0.3          −1
4          0.5           2.5         2.5
5         −1.0          −6
6         −1.2          −7
7          0             -
8          1.4           9           9
9         −0.9          −5
10        −1.3          −8
Total                                14

9.6.3 Since the difference for source 7 is zero, this pair is dropped from the analysis, and the sample size (n) is reduced to 9. Since sources 2 and 4 have the same difference, 0.5, each is given the rank (2 + 3)/2 = 2.5. The sums of the positive and negative ranks are 14 and 31 respectively. The smaller of these values, 14, is taken as T. From Annex D, for a two-sided test, the critical value (for n = 9 and 5 percent level of significance) is 6. Since the calculated value is greater than the critical value, the null hypothesis that the two solvents do not produce significantly different estimates of wax content is not rejected. Using a software package, the two-tailed p-value comes out as 0.343 > 0.05, which confirms that H0 is not rejected.

9.7 Large Samples

When the sample size is more than 25, the smaller sum of like-signed ranks, T, is approximately normally distributed with:

$$\text{Mean} = \frac{n(n+1)}{4}, \qquad \text{Variance} = \frac{n(n+1)(2n+1)}{24}$$

Therefore,

$$Z = \frac{\left|T - \dfrac{n(n+1)}{4}\right|}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$

is approximately normally distributed with mean zero and variance 1.

The value of Z is calculated and compared with the critical value of 1.96 (corresponding to 5 percent level of significance) or 2.58 (corresponding to 1 percent level of significance) for a two-sided test. For a one-sided test, the calculated value is compared with the critical value of 1.645 (corresponding to 5 percent level of significance) or 2.325 (corresponding to 1 percent level of significance). The null hypothesis is rejected if the calculated value of Z is greater than the critical value, otherwise not. Alternatively, H0 is rejected if the p-value of the test statistic, that is, the probability under H0 of the observed value or of values more favourable to H1, is less than the chosen level of significance α.

9.8 Example 6

Table 9 gives the rate of consumption of gasoline (km/l) by 30 motorcycles before and after the application of a newly developed gasoline additive. It has to be examined whether there is any significant difference in the rate of consumption of gasoline after the application of the additive.

9.8.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that there is no difference in the rate of consumption of gasoline after the application of the additive, against the alternative hypothesis (H1) that the additive reduces the rate of consumption (one-sided test). Let the distance travelled after application of the additive (y) minus the distance travelled before application of the additive (x) be denoted by d. Then under the alternative hypothesis H1, many of the values of d will be positive and, therefore, the sum of the positive ranks will be much larger than the sum of the negative ranks.

Table 9 Rate of Consumption of Gasoline (km/l)
(Clause 9.8)

Motor Cycle    Distance (km/l)                    Difference    Signed
No.            Without           With             d = y − x     Rank
               Additive (x)      Additive (y)
(1)            (2)               (3)              (4)           (5)
1              27.2              28.3              1.1          +15.5
2              31.6              30.8             −0.8          −8.5
3              29.8              30.9              1.6          +27
4              29.1              31.2              2.1          +30
5              32.0              32.7              0.7          +6
6              28.7              28.6             −0.1          −1
7              30.3              31.9              1.6          +27
8              28.3              28.9              0.6          +5
9              30.1              30.4              0.3          +3
10             27.8              28.9              1.1          +15.5
11             29.3              30.1              0.8          +8.5
12             30.4              32.0              1.6          +27
13             28.6              30.1              1.5          +24
14             29.5              30.4              0.9          +11.5
15             29.9              30.9              1.0          +13.5
16             28.2              29.1              0.9          +11.5
17             27.4              28.2              0.8          +8.5
18             27.5              27.7              0.2          +2
19             28.4              28.9              0.5          +4
20             27.8              29.1              1.3          +20
21             29.2              30.2              1.0          +13.5
22             29.9              31.1              1.2          +17.5
23             32.8              31.5             −1.3          −20
24             28.7              30.0              1.3          +20
25             30.8              31.6              0.8          +8.5
26             31.1              32.5              1.4          +22
27             27.8              29.3              1.5          +24
28             28.6              29.8              1.2          +17.5
29             30.2              31.9              1.7          +29
30             31.3              32.8              1.5          +24
Total                                                           +435.5
                                                                −29.5

Therefore, T = smaller sum of like-signed ranks = 29.5, and

$$Z = \frac{\left|29.5 - \dfrac{30 \times 31}{4}\right|}{\sqrt{\dfrac{30 \times 31 \times 61}{24}}} = 4.18$$

9.8.2 Since the calculated value of Z is greater than 1.645 (corresponding to 5 percent level of significance) and 2.325 (corresponding to 1 percent level of significance), the null hypothesis that there is no difference in the rate of consumption of gasoline after the application of the additive is rejected at 1 percent level of significance. The p-value approach leads to the same conclusion, since P(Z < −4.18) ≈ 0 < 0.01.

9.9 Power Efficiency

When all the assumptions of the t-test are met, the power efficiency of the Wilcoxon matched-pairs sign-rank test as compared to the t-test is around 95 percent.

10 MEDIAN TEST

10.1 The median test is used to test whether two independent samples have been drawn from populations with the same median. The null hypothesis (H0) is that the two samples are from populations with the same median. The alternative hypothesis (H1) may be that the median of one population is different from that of the other population (two-sided test).

10.2 Two samples of sizes n1 and n2 drawn from the two populations are pooled and the median is obtained for the combined sample of size n1 + n2 = n. The data are arranged in a 2 × 2 table as given in Table 10.

Table 10 2 × 2 Contingency Table

                                              Sample I      Sample II     Total
No. of observations equal to or greater       a             b             a + b
than combined median
No. of observations less than                 c             d             c + d
combined median
Total                                         n1 = a + c    n2 = b + d    n = n1 + n2

10.3 If both the samples are drawn from populations with the same median, it is expected that about half of the observations of each sample will be above the combined median and about half will be below; that is, it is expected that the frequencies a and c would be almost the same, and so also the frequencies b and d.

10.4 Large Samples

When the total number of observations in both the samples combined is more than 20, a χ²-test for the 2 × 2 contingency table, as given below, may be used for testing the null hypothesis:

$$\chi^2 = \frac{n\left(|ad - bc| - \dfrac{n}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)}$$

is distributed as χ² with one degree of freedom [see also IS 6200 (Part 2)].

10.4.1 The value of χ² for one degree of freedom is calculated and compared with the critical value for 5 percent level or 1 percent level of significance for a two-sided test, as given below:

Critical Values for Upper Tail and Lower Tail with Equal Area

Significance Level, α    Lower Tail    Upper Tail
0.05                     0.000 98      5.02
0.01                     0.000 039     7.88

The null hypothesis is rejected if the calculated value is greater than the critical value of the upper tail or smaller than the critical value of the lower tail, otherwise it is not rejected.

10.5 Example 7

In a factory, two machines were used for manufacturing steel tubes of nominal outside diameter 60 mm. Table 11 gives the outside diameter of steel tubes randomly selected from both the machines. Test whether the two machines are manufacturing tubes of the same diameter.

Table 11 Outside Diameter of Steel Tubes, mm

Sl No.    Machine I    Combined Rank    Sl No.    Machine II    Combined Rank
1         59.2         1                1         59.4          4.5
2         60.2         18.5             2         59.5          6
3         59.3         2.5              3         60.7          27
4         60.5         24.5             4         59.8          12.5
5         60.5         24.5             5         59.8          12.5
6         60.6         26               6         59.8          12.5
7         60.3         20.5             7         59.7          9
8         59.3         2.5              8         60.0          15.5
9         59.6         7                9         59.7          9
10        60.3         20.5             10        59.4          4.5
11        60.4         22.5             11        60.1          17
12        59.7         9                12        60.0          15.5
13        60.4         22.5
14        60.2         18.5
15        59.8         12.5

10.5.1 When the values from both samples are combined and arranged in ascending order, the combined median comes out as 59.8. The data are arranged in a 2 × 2 contingency table, as shown in Table 12. The four observations equal to the combined median are counted with the lower group.

Table 12 2 × 2 Contingency Table

                                              Machine I    Machine II    Total
No. of observations equal to or less          6 (a)        8 (b)         14
than combined median
No. of observations greater than              9 (c)        4 (d)         13
combined median
Total                                         n1 = 15      n2 = 12       n = 27

The calculation for χ² may be shown as:

$$\chi^2 = \frac{n\left(|ad - bc| - \dfrac{n}{2}\right)^2}{(a+b)(c+d)(b+d)(a+c)} = \frac{27\left(|24 - 72| - \dfrac{27}{2}\right)^2}{14 \times 13 \times 12 \times 15} = \frac{27 \times (34.5)^2}{32\,760} = 0.981$$

10.5.2 Since the calculated value of χ² is less than 3.84, the tabulated value at 5 percent level of significance, the null hypothesis that both the machines are manufacturing tubes of the same diameter is not rejected. Using a software package, the p-value comes out as 0.256 > 0.05, which upholds the decision not to reject H0.

10.6 Power Efficiency

The power efficiency of the median test as compared to the t-test is about 95 percent for n1 + n2 = 6. The power efficiency decreases as the sample size increases, reaching an asymptotic efficiency of 63 percent.

11 MANN-WHITNEY U TEST

11.1 The Mann-Whitney U test is used to test whether two independent samples have been drawn from the same population. This is one of the most powerful non-parametric tests and is the most useful alternative to the parametric t-test.

11.2 Suppose two samples are drawn from two populations A and B. The null hypothesis is that both the populations are identical. The alternative hypothesis may be that the location parameter of one population is larger (or smaller) than the location parameter of the other population, that is, the bulk of the distribution of one population is to the right (or to the left) of the bulk of the distribution of the other (one-sided test). For a two-sided test, the alternative hypothesis is that the two populations are not identical.

11.3 To apply this test, let

n1 = the number of observations in the smaller of the two independent samples, and
n2 = the number of observations in the larger of the two independent samples.

11.3.1 The observations from both the samples are combined and arranged in non-descending order with the identity of the samples preserved. The ranks are given in order of increasing size. In this ranking, algebraic size is considered, that is, the lowest rank is assigned to the largest negative number, if any. In case a tie occurs, each of the tied observations is given the average of the ranks which they would have had if the values had differed slightly. Calculate the sum of the ranks assigned to the sample with n1 observations (say R1). Similarly, find the sum of the ranks assigned to the sample with n2 observations (say R2). Two values U1 and U2 are calculated by the following relations:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

NOTES
1 In fact U1 is the number of times that an observation in the sample of size n2 precedes an observation in the sample of size n1. U2 may be defined similarly.
2 The values of U1 and U2 as calculated above may also be verified by the following relation:

$$U_1 + U_2 = n_1 n_2$$

11.3.2 The smaller of U1 and U2 is taken as the value of U.

11.4 Small Samples

The following method shall be employed when n2, the number of observations in the larger of the two independent samples, is less than or equal to 20.

11.4.1 The value of U as calculated in 11.3.2 is compared with the critical value for the given n1, n2 and the desired level of significance. The critical values for this purpose are given in Annexes E and F (for one-sided test) and Annexes G and H (for two-sided test). The null hypothesis shall be rejected if the calculated value of U is less than the critical value, otherwise not. Alternatively, H0 is rejected if the p-value of the test statistic, that is, the probability under H0 of the observed value or of values more favourable to H1, is less than the chosen level of significance α.

11.5 Example 8

Tests were conducted by a consumer organization on two types of flares, Super-flash and Britalite. Table 13 gives the burning time (in min) for 12 flares of each make. It has to be tested whether Super-flash has more burning time. The conclusion is also to be compared with that of the parametric t-test.

Table 13 Burning Time, min
(Clauses 11.5 and 11.5.4)

Sl No.    Super-flash    Combined Rank    Britalite    Combined Rank
(1)       (2)            (3)              (4)          (5)
1         20.9           19               15.9         10
2         19.3           17               15.5         6
3         19.6           18               17.4         15
4         23.3           23               18.0         16
5         21.2           20               13.9         3
6         22.4           22               15.6         7
7         14.2           4                15.8         9
8         16.5           12               13.4         2
9         16.7           13               10.1         1
10        17.3           14               24.3         24
11        15.2           5                15.7         8
12        21.4           21               16.4         11

11.5.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that the two types of flares do not differ in burning time, against the alternative hypothesis (H1) that Super-flash has more burning time.

11.5.2 The sequence of observations of the two samples, when combined and arranged in non-descending order, is given by:

Sequence    B   B   B   A   A   B   B   B   B   B   B   A
Ranks       1   2   3   4   5   6   7   8   9   10  11  12

Sequence    A   A   B   B   A   A   A   A   A   A   A   B
Ranks       13  14  15  16  17  18  19  20  21  22  23  24

where
A denotes an observation from Super-flash, and
B denotes an observation from Britalite.

Here
R1 = sum of ranks of observations from Super-flash = 188, and
R2 = sum of ranks of observations from Britalite = 112.

Therefore,

$$U_1 = 144 + \frac{12 \times 13}{2} - 188 = 222 - 188 = 34$$

Similarly,

$$U_2 = 144 + \frac{12 \times 13}{2} - 112 = 222 - 112 = 110$$

Therefore, U = minimum of U1 and U2 = 34.

11.5.3 The critical value (one-sided test) for n1 = 12, n2 = 12 and 5 percent level of significance is 42 (see Annex E). Since the calculated value of U is less than the critical value, the null hypothesis that there is no difference in the burning time of the two types of flares is rejected.

11.5.4 Applying the t-test to the data given in Table 13, we get:

for Super-flash, n1 = 12 and mean (x̄) = 19, and
for Britalite, n2 = 12 and mean (ȳ) = 16.

Also,

$$s^2 = \frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = 9.92, \qquad s = 3.15$$

Therefore,

$$t = \frac{\bar{x} - \bar{y}}{s\left[\dfrac{1}{n_1} + \dfrac{1}{n_2}\right]^{1/2}} = \frac{(19 - 16)\sqrt{6}}{3.15} = 2.33$$

The critical value (one-sided) of t for 5 percent level of significance and 22 degrees of freedom is 1.717. As the calculated value is greater than the critical value, the null hypothesis is rejected, thereby arriving at the same conclusion as by the Mann-Whitney U test.

11.6 Large Samples (n1, n2 Larger than 20)

As n1 and n2 increase in size, the distribution of U approaches the normal with:

$$\text{Mean} = \frac{n_1 n_2}{2}, \qquad \text{Variance} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$

Therefore,

$$Z = \frac{\left|U - \dfrac{n_1 n_2}{2}\right|}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

is a standardized normal variate. The value of Z is calculated and compared with the critical value as given in 8.6.1. Alternatively, H0 is rejected if the p-value of the test statistic, that is, the probability under H0 of the observed value or of values more favourable to H1, is less than the chosen level of significance α.

11.7 Example 9

In an experiment, 45 mentally retarded sub-normal patients with behaviour disorders were randomly divided into two groups of sizes 22 and 23 respectively. Those in Group B were given inert tablets whereas those in Group A were treated with a tranquilizer. At the end of the period of treatment, all the patients were rated on the Claridge excitability rating scale, on which the highest score corresponds to the most distinguished behaviour. It is desired to test whether the tranquilizer is more effective in improving the patients' behaviour. The scores are as follows:

Sl No.    Group A    Combined Rank    Group B    Combined Rank
1         84         14               82         12
2         141        31               70         6
3         224        44               76         10
4         72         8                118        24
5         154        33.5             100        20
6         218        43               174        36
7         91         17               135        28
8         137        30               88         16
9         209        41               78         11
10        111        21               128        26
11        238        45               74         9
12        147        32               58         4
13        193        40               135        28
14        96         18               185        39
15        154        33.5             46         3
16        210        42               41         1
17        119        25               71         7
18        178        37               135        28
19        182        38               116        23
20        160        35               83         13
21        99         19               69         5
22        114        22               86         15
23        -          -                -          2

11.7.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that both the treatments are equally effective, against the alternative hypothesis (H1) that the tranquilizer is more effective in improving the patients' behaviour.

11.7.2 When the observations of the two samples were combined and arranged in ascending order, the following sums of ranks were obtained:

R1 = sum of ranks of observations from Group A = 669
R2 = sum of ranks of observations from Group B = 366

Also n1 = 22 and n2 = 23.

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 506 + 253 - 669 = 90$$

and

$$U_2 = n_1 n_2 - U_1 = 416$$

Therefore, U = minimum of (U1, U2) = 90.

$$\text{Mean} = \frac{n_1 n_2}{2} = 253$$

$$\text{Variance} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = 1\,939.67$$

$$\text{Standard deviation} = \sqrt{1\,939.67} = 44.04$$

$$Z = \frac{|90 - 253|}{44.04} = 3.70$$

Since the calculated value of Z is greater than 2.325, the null hypothesis is rejected, thereby implying that the tranquilizer is more effective in improving the patients' behaviour at 1 percent level of significance. The conclusion is upheld by the p-value approach as well, since P(Z < −3.70) ≈ 0 < 0.01.

11.8 Power Efficiency

The power efficiency of the Mann-Whitney U test as compared to the t-test is about 95.5 percent.

12 WALD-WOLFOWITZ RUN TEST

12.1 The Wald-Wolfowitz run test is used to test the null hypothesis that two independent samples have been drawn from the same population, against the alternative hypothesis that the two populations differ in any respect whatsoever. The two populations may differ in central tendency, variability, skewness, etc. Thus this test may be used to test a large number of alternative hypotheses, whereas many other tests are applicable only to a particular type of difference between the two populations (for example, the median test determines whether the two samples have been drawn from two populations with the same median).

12.2 Two samples of sizes n1 and n2 drawn from two populations are pooled and the observations of both the samples are arranged in increasing order. The total number of runs is determined. A run is defined as any sequence of observations from the same sample. For example, the following sequence of size n1 + n2 = 10 may be observed:

A B A A B B B A B B

Total number of runs (r) = number of runs of A + number of runs of B = 3 + 3 = 6

12.2.1 The number of runs may also be calculated by noting down the number of transitions from A to B or from B to A and using the following relation:

Total number of runs (r) = number of transitions + 1

12.3 It is also possible that ties may occur. If the ties are within the same sample, there is no problem, as the number of runs is not affected. But if there are ties among observations from both the samples, one may not get a unique value of r. In that case, one has to break the ties in all possible ways and find the corresponding values of r. If all these different values of r lead to the same conclusion at the desired level of significance, there is no problem. In case different values of r lead to different decisions, the largest among these values is accepted as a conservative approach (that is, an approach that rejects H0 rather cautiously). For example:

X sample: 17, 18, 19, 19
Y sample: 16, 19, 19

Combined sequence: 16 17 18 19 19 19 19

Different ways of breaking ties        Value of r
I      Y X X X X Y Y                   3
II     Y X X X Y X Y                   5
III    Y X X X Y Y X                   4
IV     Y X X Y X Y X                   6
V      Y X X Y X X Y                   5
VI     Y X X Y Y X X                   4

No. of ways = 4!/(2! 2!) = 6

Here we take r = 6. However, if the number of ties across the samples is large, the run test is not recommended.

12.4 If the two samples are drawn from the same population, that is, if H0 is true, then the observations of A and B will be well mixed. In that case r, the number of runs, will be relatively large. Therefore, if H0 is true the value of r will be large, and if H0 is false the value of r will be small.

12.5 Small Samples

This method shall be employed when n1 and n2 are less than or equal to 20. The number of runs (r) is calculated by the method given in 12.2 and this value of r is compared with the critical value of r for the given n1, n2 and the desired level of significance. The critical values of r are given in Annex J for 5 percent level of significance and in Annex K for 1 percent level of significance. The null hypothesis is rejected if the calculated value of r is less than the critical value, otherwise not. Alternatively, if the calculated p-value of the test statistic is less than the chosen level of significance α, H0 is rejected.

12.6 Example 10

The members of a consumer association investigated two brands of canned peas selling at the same price for the same size of can. A random selection of 5 cans of brand A and 7 cans of brand B was made. The drained weight (in g) of the cans is given in Table 14. It is desired to test the hypothesis that both the brands are equally good with regard to the net content.

Table 14 Drained Weights of Cans, g

Brand A    Brand B
297        280
292        308
312        311
307        293
317        314
           316
           296

12.6.1 Null Hypothesis (H0) and Alternative Hypothesis (H1)

The null hypothesis is that both the brands are equally good with regard to the net content.

12.6.2 When the observations from both the brands are pooled and arranged in ascending order, the following sequence is obtained:

B A B B A A B B A B B A

So, r = number of runs = 8.

12.6.3 The critical value of r for n1 = 5, n2 = 7 and 5 percent level of significance is 3. Since the calculated value of r is greater than the critical value, the null hypothesis is not rejected.

12.6.4 Comparison with t-test

For brand A:
Mean (x̄) = 305.0
Variance (s1²) = 86.00

For brand B:
Mean (ȳ) = 302.6
Variance (s2²) = 150.82

$$s^2 = \frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = 148.57, \qquad s = 12.19$$

Therefore,

$$t = \frac{|\bar{x} - \bar{y}|}{s\left[\dfrac{1}{n_1} + \dfrac{1}{n_2}\right]^{1/2}} = \frac{|305.0 - 302.6|}{12.19\sqrt{\dfrac{12}{35}}} = 0.34$$

12.6.4.1 The tabulated value of t at 5 percent level of significance (two-sided test) is 2.23. Since the calculated value is less than the tabulated value, the null hypothesis that there is no significant difference between the two brands is not rejected, thereby leading to the same conclusion as by the Wald-Wolfowitz run test.

12.7 Large Samples

When either n1 or n2 is greater than 20, the number of runs is approximately normally distributed with:

$$\text{Mean } (\mu_r) = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \quad \text{and}$$

$$\text{Variance } (\sigma_r^2) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

Therefore,

$$Z = \frac{|r - \mu_r|}{\sigma_r}$$

is a standardized normal variate.

12.7.1 When n1 + n2 is less than 30 (with either n1 or n2 more than 20), the continuity correction should be applied in Z by subtracting 0.5 from the absolute difference of (r − μr). Thus in this case the value of Z will be given by:

$$Z = \frac{|r - \mu_r| - 0.5}{\sigma_r}$$

12.7.2 The value of Z is calculated and compared with the critical value as given in 8.6.1. The null hypothesis is rejected if the calculated value of Z is greater than the critical value, otherwise not. Alternatively, H0 is rejected if the p-value of the test statistic, that is, the probability under H0 of the observed value or of values more favourable to H1, is less than the chosen level of significance α.
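The counting of runs in 12.2.1 and the large-sample statistic of 12.7 and 12.7.1 may be sketched in Python as follows. The function names `count_runs` and `runs_z` are illustrative and not part of this standard. The sketch assumes that there are no ties between observations of different samples (see 12.3), and `runs_z` applies the formulae as written without checking the sample-size conditions of 12.7.

```python
import math


def count_runs(sample_a, sample_b):
    """Return (r, sequence) for the pooled samples, assuming no cross-sample ties."""
    pooled = sorted([(v, "A") for v in sample_a] + [(v, "B") for v in sample_b])
    labels = [label for _, label in pooled]
    # r = number of transitions (A to B or B to A) + 1, as in 12.2.1.
    transitions = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return transitions + 1, "".join(labels)


def runs_z(r, n1, n2, continuity=False):
    """Z of 12.7; with continuity=True, the corrected form of 12.7.1."""
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    diff = abs(r - mean)
    if continuity:
        diff -= 0.5
    return diff / math.sqrt(variance)


# Data of Table 14 (Example 10).
brand_a = [297, 292, 312, 307, 317]
brand_b = [280, 308, 311, 293, 314, 316, 296]
r, sequence = count_runs(brand_a, brand_b)
print(r, sequence)
```

For Example 10 both samples are small, so r is compared with the critical value from Annex J as in 12.5; `runs_z` would be used only when the conditions of 12.7 are met.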
14 IS 6200 (Part 4) :2008 ANNEX A (Ckwe CRITICAL 6.3) ONE-SAMPLE TEST VALUES OF D IN THE KOLMOGOROV-SMIRNOV Sample Size (n) 1 2 3 4 Level 5 Percent 0.975 0.842 0,708 0.624 of Significance 1 Percent 0.995 0.929 0.828 0.733 I 5 I 0.565 I 0.669 6 7 8 0.521 0.486 0.457 0.618 0.577 0.543 I 9 10 I I 0.432 0.410 0.391 0.375 0.361 0.349 0.338 0.328 0.318 0.309 0.301 0.294 0.27 0.24 0.23 1.361~ I 0.514 0.490 I 12 13 14 15 16 17 18 19 20 25 30 35 Over 35 I 0.468 0.450 0.433 0.418 0.404 0.392 0.381 0.371 0.363 0.356 0,32 0.29 0.27 1.63/d 15 IS 6200 (Part 4) :2008 ANNEX B (Clauses 7.7 and 7.8.3) CRITICAL Sample Size 01) 3 4 5 6 7 8 9 10 11 12 13 14 Is 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 35 40 VALUES OF KD IN THE KOLMOGOROV-SMIRNOV One-Sided Test fc)r Level of Sigtlifkance I Percent -- -- 5 6 6 6 7 7 8 8 8 8 9 Tw-Sided TWO-SAMPLE TEST Test for Level of Significance 1 Percent -- -- 5 6 6 7 7 8 8 8 9 9 9 5 Percent 3 4 4 5 5 5 6 6 6 6 7 7 7 7 8 8 8 8 8 9 9 9 9 9 9 10 10 10 11 11 5 Percent -- 4 5 5 6 6 6 7 7 7 7 8 8 9 9 10 10 10 10 11 11 11 11 11 12 12 12 12 13 14 8 8 9 9 9 9 9 10 10 10 10 10 11 II 11 12 13 10 10 10 10 11 11 11 11 12 12 12 12 13 13 13 14 15 16 IS 6200 (Part 4) :2008 ANNEX C (Clauses 8.4 and 8.5.2) CRITICAL Sample Size (n) 5 6 7 8 9 10 11 1213 14 15 16 17 18 19 20 21 22 23 24 25 One-Sided VALUES OF X IN THE SIGN TEST Two-Sided Tesifor Level of Significance 1 Percent -- Test for Level of Sigruyicance 1 Percent -- -- n u 0 0 0 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 5 Percent 0 0 0 1 1 1 2 2 3 3 3 4 4 5 5 5 6 6 7 7 7 5 Percent -- o u 0 1 1 1 2 2 2 3 2 0 0 0 0 1 1 1 2 J I -i L 2 3 3 3 4 4 4 5 5 I 4 4 4 5 5 5 6 6 7 17 IS 6200 (Part 4) :2008 ANNEX D (Clauses 9.5 ad 9.6.3) CRITICAL Sample Size (n) 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 VALIJES OH Z IN THE WILCOXON One-Sided MATCHED-PAIRS Two-Sided SIGN-RANK TEST Test fbr Level qf Sign I~icance 1 Percent --. 0 2 3 5 7 10 13 16 20 24 28 33 38 43 49 56 62 69 77 Test for Level of Significance 1 Percent -- . 
0 2 3 5 7 10 13 16 20 23 28 32 38 43 49 55 61 68 5 Percent 2 3 5 8 10 13 17 21 25 30 35 41 47 53 60 67 75 83 91 100 5 Percent o 2 4 6 8 11 14 17 21 25 30 35 40 46 52 59 66 73 81 89 18 IS 6200 (Part 4) :2008 ANNEX E (Clau.w 11.4.1 CRITICAL and 11.5.3) FOR 5 PERCENT VALUES OF U IN MANN-WHITNEY U TEST (ONE-SIDED) LEVEL OF SIGNIFICANCE 4 ­ () 1 2 3 4 5 0 6 () 7 8 ­ 9 lo 11 ­ 12 13 2 6 14 15 16 11,/112 L 1 2 ?! 17 , 3 a 9 15 20 26 I 18 19 I 20 I 2 3 4 5 6 7 ­ ­ - - () () -1-1 2 1 7 3 7 12 18 23 28 1 -1-1 , 3 g 14 19 25 -1o1o # 4 9 16 22 28 [ 4 10 17 23 30 o 2 4 6 /? 11 1 3 1 3 6 9 12 Is 1 4 7 II 14 17 20 24 27 31 34 37 41 44 48 51 55 58 62 1 5 8 12 16 19 23 27 31 34 38 42 46 50 54 57 61 65 69 2 5 9 13 17 21 26 30 34 38 42 47 51 55 60 64 68 72 77 4 1] 18 25 32 ? 1 2 4 5 6 2 3 5 7 8 5 8 10 13 10 II 15 19 24 28 33 37 42 47 51 56 61 65 70 75 80 84 16 21 26 0 0 () 1 2 2 30 36 42 48 54 60 65 7] 77 83 89 95 101 33 39 4s 51 57 64 70 77 83 89 96 102 35 41 48 55 61 68 75 82 88 95 102 109 116 123 37 44 51 58 65 72 80 87 94 101 109 116 123 130 39 47 54 62 69 77 84 92 x 9 1() II 12 13 14 15 16 17 18 19 20 ­ ­ -­ ­ 0 () 1 1 1 I 2 2 2 3 3 3 4 4 4 3 3 4 5 5 6 7 7 8 9 9 10 11 5 6 7 8 9 8 9 11 12 13 15 16 18 19 20 22 23 25 1() 12 14 16 17 19 21 23 25 26 28 30 32 13 IS 17 19 21 24 26 28 30 33 35 37 39 [5 18 20 23 26 28 31 33 36 39 41 44 47 18 21 24 27 30 33 36 39 42 45 48 51 54 31 36 41 46 51 56 61 66 71 77 82 87 92 33 39 44 50 55 61 66 72 77 83 88 94 100 10 11 12 14 15 16 17 18 100 107 115 123 130 138 109 115 107 19 IS 6200 (Part 4) :2008 ANNEX F (Clause 11.4.1) CRITICAL VALUES OF U IN MANN-WHITNEY U-TEST (ONE-SIDED) LEVEL OF SIGNIFICANCE 4 5 6 7 8 9 10 FOR 1 PERCENT nll 112 1 ­ - 2 3 11 - 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 ­ -- ­ 0 ­ - o - - - - - ­ - - - - - - - - - 0 2 5 8 2 5 9 12 16 0 2 6 0 3 7 () 3 7 12 16 21 o 4 8 13 18 23 () 4 9 14 19 24 1 4 9 15 20 26 1 5 10 16 22 28 34 40 0 1 2 3 4 1 3 4 6 0 2 4 6 7 1 3 5 7 9 1 3 6 8 11 I 4 7 9 12 0 1 1 2 10 13 17 11 15 19 11 14 1 2 3 3 
3 4 5 6 0 1 1 1 2 2 2 3 3 4 4 4 5 6 7 8 7 9 9 11 }1 14 13 16 1s 18 17 21 20 23 22 26 24 28 26 31 28 33 30 36 32 38 9 10 - ­ 11 12 14 16 17 19 21 23 24 26 28 13 15 17 20 22 24 26 28 30 32 34 16 18 21 23 26 28 31 33 36 38 40 19 22 24 27 30 33 36 38 41 44 47 22 25 28 31 34 37 41 44 47 50 53 24 28 31 35 38 42 46 49 53 56 60 27 31 35 39 43 47 51 55 59 63 67 30 34 38 43 47 51 56 60 65 69 73 33 37 42 47 51 56 61 66 70 75 80 36 41 46 51 56 61 66 71 76 82 87 38 44 49 55 60 66 71 77 82 88 93 41 47 53 59 65 7(3 76 82 88 94 100 44 50 56 63 69 75 82 88 94 101 107 47 53 60 67 73 80 87 93 100 107 114 11 12 13 14 15 16 17 18 19 20 ­ ­ ­ ­ ­ ­ ­ 0 0 0 0 0 0 1 1 4 5 5 6 7 7 8 9 9 10 7 8 9 10 11 12 13 14 15 16 9 11 12 13 15 16 18 19 20 22 20 IS 6200 (Part 4) :2008 ANNEX G (clam? 11 .4.1) CRITICAL VALUES OF U IN MANN-WHITNEY U-TEST (TWO-SIDED) OF SIGNIFICANCE 3 ­ o 1 1 ­ ­ 0 1 2 3 ­ o 1 2 3 5 4 5 1 2 3 5 6 6 I 3 5 6 8 7 ­ 8 ­ 0 2 4 6 8 10 9 FOR 5 PERCENT LEVEL 1 111/ I 2 ­ ­ ­ () 112 1 2 3 4 5 6 7 8 ­ - I 10 II -- 0 3 5 8 11 14 I 12 13 14 [5 16 17 18 19 20 I -- o 2 4 7 10 12 -- 0 3 6 9 13 16 -- 1 4 7 11 14 18 -- 1 4 8 12 16 20 24 28 33 37 41 . 1 5 9 13 17 22 26 31 36 40 45 -- 1 5 10 14 19 24 29 34 39 44 49 -- I 6 11 15 21 26 31 37 42 47 53 -- 2 6 11 17 22 28 34 39 45 51 57 -- 2 7 12 18 24 30 36 42 48 55 61 . 
2 7 13 19 25 32 38 45 52 58 65 -- 2 8 13 20 27 34 41 48 55 62 69 76 83 90 98 105 112 119 127 2 2 3 3 4 4 5 5 6 6 7 7 8 4 4 5 6 7 8 9 10 II 11 12 13 13 6 7 8 9 II 12 13 14 15 17 18 19 20 8 10 11 13 14 16 17 19 21 22 24 25 27 10 12 14 16 18 20 22 24 26 28 30 32 34 13 15 17 19 22 24 26 29 31 34 36 38 41 15 17 20 23 26 28 31 34 37 39 42 45 48 17 20 23 26 29 33 36 39 42 45 48 52 55 19 23 26 30 33 37 40 44 47 51 55 58 62 22 26 29 33 37 9 ­ o 0 0 1 1 1 1 1 2 2 2 2 10 11 12 - 13 ­ 14 15 ­ 16 17 - 41 45 49 53 57 61 65 69 45 50 54 59 63 67 72 76 50 55 59 64 67 74 78 83 54 59 64 70 75 80 85 90 59 64 70 75 81 86 92 98 63 67 75 81 87 93 99 105 67 74 80 86 93 99 106 112 72 78 85 92 99 106 113 119 18 ­ 19 ­ 20 ­ 21 IS 6200 (Part 4) :2008 ANNEX H (Ckwse 11.4.1) CRITICAL VALUES OF U'IN MANN-WHITNEY U TEST (TWO-SIDED) OF SIGNIFICANCE 3 ­ ­ ­ ­ ­ 0 4 ­ o o 1 1 5 ­ 0 1 1 2 o 1 2 3 4 6 ­ 0 1 3 4 6 7 ­ 8 ­ 1 2 4 6 7 9 0 1 3 5 7 9 10 ­ 0 2 4 6 9 11 11 0 `2 5 7 10 13 12 ­ 1 3 6 9 12 15 13 ­ 1 3 7 10 13 17 14 ­ 1 4 7 11 15 18 15 ­ 2 5 8 12 16 20 FOR 1 PERCENT LEVEL n,hq 1 2 3 4 5 6 7 8 9 1 2 ­ ­ ­ ­ ­ -- 16 ­ 2 5 9 13 18 22 17 ­ 2 6 10 15 19 24 18 ­ 2 6 11 16 21 26 19 ­ 0 3 7 12 17 22 28 20 ­ () 3 8 ]3 18 24 30 36 42 48 54 60 67 73 79 86 92 99 105 3 4 5 6 7 7 8 9 10 11 12 13 5 6 7 9 10 11 12 13 15 16 17 18 7 9 10 12 13 15 16 18 19 21 22 24 9 11 13 15 17 18 20 22 24 26 28 30 11 13 16 18 20 22 24 27 29 31 33 36 ]3 16 18 21 24 26 29 31 34 37 39 42 16 18 21 24 27 30 33 36 39 42 45 48 18 21 24 27 31 34 37 41 44 47 51 54 20 24 27 31 34 38 42 45 49 53 57 60 22 26 30 34 38 42 46 50 54 58 63 67 24 29 33 37 42 46 51 55 60 64 69 73 27 31 36 41 45 50 55 60 65 70 74 79 29 34 39 44 49 54 60 65 70 75 81 86 31 37 42 47 53 58 64 70 75 81 87 92 33 39 45 51 57 63 69 74 81 87 93 99 10 11 12 13 14 15 16 17 18 19 20 ­ - ­ ­ ­ ­ 0 0 0 0 1 1 1 2 2 2 2 3 3 2 2 3 3 4 5 5 6 6 7 8 22 IS 6200 (Part 4) :2008 ANNEX J (Clause 12.5) CRITICAL VALUES OF r IN WALI)-WOLFOWITZ RUN TEST FOR 5 PERCENT SIGNIFICANCE n,l n7 2 3 4 5 
6 7 8 9 10 II 12 13 14 15 16 17 18 19 Z() 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 5 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5 6 2 2 3 3 3 3 4 4 4 4 5 5 5 5 5 5 6 6 7 2 LEVEL OF 8 2 9 2 1() 2 11 2 12 2 2 13 2 2 14 2 2 15 2 3 3 4 5 6 6 7 7 8 8 9 9 10 10 11 11 1} 12 16 2 3 4 4 5 6 6 7 8 8 9 9 ]0 10 11 11 11 ]2 12 ]7 2 3 4 4 5 6 7 7 g 9 9 10 10 ]1 11 11 12 ]2 13 ]8 2 3 4 5 5 6 7 8 8 9 9 10 10 11 1] 12 12 ]3 13 19 2 3 4 5 6 6 7 8 8 9 10 10 11 1] ;2 12 13 ]3 13 20 2 3 4 5 6 6 7 8 9 9 10 10 11 1'2 12 13 13 13 14 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6 3 3 3 4 4 5 5 5 6 6 fj (j 6 7 7 7 7 3 3 4 4 5 5 5 6 6 6 7 7 7 7 8 8 8 3 3 4 5 5 5 6 6 7 7 7 7 8 8 8 8 9 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9 9 9 3 4 4 5 6 6 7 7 7 8 8 g 9 9 9 ]0 10 3 4 5 5 6 6 7 7 8 8 9 9 9 10 10 10 10 3 4 5 5 6 7 7 8 8 9 9 9 10 10 10 11 11 23 1S 6200 (Part 4) :2008 ANNEX K (Clciuse CRITICAL ~- 12.5) I I 1 :;; 3 4 516 7 +' , `()" 1`2 `3 `4 `5 `"~`7 `8 `9'20! 13 - - - - - `- - - - 2 2 2 2 2 ~ 2 2 2 4 k 5 6 7 8 9 ~ ,1[ 1~ 13 14 115 16 17 lx [9 20 -- -- ­ 2 2 2 2 2 2 2 2 ~ -- -- 2 2 ~ 2 2 2 2 3 3 3 3 3 3 -- -- 2 2 2 2 j 3 ~ 3 3 3 3 3 4 4 4 -- ~ 2 2 3 3 -j 3 3 3 4 4 4 4 4 4 4 -- 2 2 3 3 3 ~ 4 4 4 4 4 5 5 5 5 5 2 2 3 3 3 3 4 4 4 5 5 5 5 5 6 6 6 ? 2 3 3 3 4 4 5 5 5 5 6 6 6 6 6 7 2 3 3 3 4 4 5 .5 5 5 6 617 67 7 7 7 7 7 7 8 8 2 3 3 4 4 5 ~ 5 6 6 6 2 3 3 4 4 5 5 6 6 6 7 7 7 8 8 8 8 2 3 3 4 5 5 5 6 6 7 7 7 8 8 8 9 9 2 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9 9 3 3 4 4 5 6 6 7 7 7 8 8 9 9 9 10 10 3 3 4 5 5 6 (j 7 7 8 8 9 9 9 10 10 I(J ~ 3 4 5 5 6 ~ 7 g 8 8 9 9 10 lo 10 1] `j 4 4 5 6 (j 7 7 g 8 9 9 10 [0 11 11 II 3 4 4 5 6 6 7 g * 9 9 10 10 10 11 1] ]2 3 4 4 5 6 7 ~ ~ g 9 9 ]() 10 11 ]] ]2 12 VALUES OF r IN \VALI)-WOLFOWITZ RUN TEST FOR 1 PERCENT LEVEL OF SIGNIFICANCE 24 Bureau of Indian Standards BIS is a statutory institution established under the Bureau harmonious development of the activities of standardization, and attending to connected matters in the country. 
of Indian Standards Act, 1986 to promote marking and quality certification of goods

Copyright

BIS has the copyright of all its publications. No part of these publications may be reproduced in any form without the prior permission in writing of BIS. This does not preclude the free use, in the course of implementing the standard, of necessary details, such as symbols and sizes, type or grade designations. Enquiries relating to copyright be addressed to the Director (Publications), BIS.

Review of Indian Standards

Amendments are issued to standards as the need arises on the basis of comments. Standards are also reviewed periodically; a standard along with amendments is reaffirmed when such review indicates that no changes are needed; if the review indicates that changes are needed, it is taken up for revision. Users of Indian Standards should ascertain that they are in possession of the latest amendments or edition by referring to the latest issue of 'BIS Catalogue' and 'Standards : Monthly Additions'.

This Indian Standard has been developed from Doc : No. MSD 3 (323).

Amendments Issued Since Publication

Amend No.    Date of Issue    Text Affected

BUREAU OF INDIAN STANDARDS

Headquarters:
Manak Bhavan, 9 Bahadur Shah Zafar Marg, New Delhi 110002
Telephones : 23230131, 23233375, 23239402
Telegrams : Manaksanstha (Common to all offices)

Regional Offices :                                                            Telephone
Central  : Manak Bhavan, 9 Bahadur Shah Zafar Marg, NEW DELHI 110002          23237617, 23233841
Eastern  : 1/14 C.I.T. Scheme VII M, V.I.P. Road, Kankurgachi, KOLKATA 700054  23378499, 23378561, 23378626, 23379120
Northern : SCO 335-336, Sector 34-A, CHANDIGARH 160022                        2603843, 2609285
Southern : C.I.T. Campus, IV Cross Road, CHENNAI 600113                       22541216, 22541442, 22542519, 22542315
Western  : Manakalaya, E9 MIDC, Marol, Andheri (East), MUMBAI 400093          28329295, 28327858, 28327891, 28327892

Branches : AHMEDABAD. BANGALORE. BHOPAL. BHUBANESHWAR. COIMBATORE. FARIDABAD. GHAZIABAD. GUWAHATI. HYDERABAD. JAIPUR. KANPUR. LUCKNOW. NAGPUR. PARWANOO. PATNA. PUNE. RAJKOT. THIRUVANANTHAPURAM. VISAKHAPATNAM.

Printed by Sunshine Graphics